Executive Summary



The No Child Left Behind Act of 2001 (NCLB) requires the reporting of participation in 
assessments overall and by subgroup, including students with disabilities. As states and school 
districts strive to meet the goals for adequate yearly progress required by NCLB, the use of 
individual accommodations continues to be scrutinized for effectiveness, threats to test validity, 
and score comparability. This report summarizes 49 empirical research studies completed on 
test accommodations between 2002 and 2004, and provides direction in the design of critically 
needed future research on accommodations. 

NCEO found that studies during this three-year period had the following characteristics: 

Purpose. The primary purpose of the 2002-2004 accommodations research was to determine 
the effects of accommodations use on the large-scale test scores of students with disabilities. 

Types of assessment, content areas, and accommodations. The majority of the studies tested 
students using norm-referenced or criterion-referenced tests, on math or reading/language 
arts. 

Participants. Roughly equal numbers of research studies involved 1-99 participants, 100-999
participants, and more than 1,000 participants of multiple age categories. Participants
were varying percentages of students without disabilities and students with disabilities. Students 
with learning disabilities were studied most frequently among students who receive special 
education services. 

Findings. Findings shared no common theme, with various accommodations shown to have 
both a positive and non-positive effect on scores. Individual accommodations showed either 
differential item functioning or no differential item functioning depending on the study. The 
lack of consistent findings points to a need for further research. 

Limitations. Most often, authors noted that studies were too narrow in scope, involved a small 
sample size, or provided confounding factors. These limitations and other considerations led re- 
searchers to recommend investigating the characteristics of accommodations in further detail. 

Important overall observations from the NCEO analysis include a need in future research for 
a clear definition of the constructs tested, a reduction in confounding factors, increased study 
of institutional factors affecting accommodations judgment, and exploration of the desirability 
and perceived usefulness of accommodations by students themselves. Future research should 
focus on improvement in these areas but also on the positive effects of field-testing potential 
items in accommodated formats in addition to standard formats. 

Introduction



Over the past decade, students with disabilities have increasingly participated and performed at 
proficient levels on general education assessments. Participation and proficient performance of all 
students are required by the No Child Left Behind Act of 2001. As participation rates increase, 
so does the use of testing accommodations. Increased use of accommodations should reflect 
an attempt to ensure that the scores received by students with disabilities are valid measures of 
achievement. It is also possible that increased use of accommodations is simply a reflection of 
concern about including students in assessments and a belief that these students need additional 
aids to help them perform better. Because of this, states are clarifying appropriate accommodation 
use in state policy, with the goal of encouraging Individualized Education Program (IEP) teams 
to select accommodations that remove specific disability barriers, but do not give students with 
disabilities an unfair advantage over their peers. States have begun to monitor accommodations 
use, and this will help them to better track the effects of accommodations. In 2005, 20 states 
maintained a database of accommodations actually used during testing within the state, and 
26 states documented the specific accommodations used by students on test day (Thompson, 
Johnstone, Thurlow, & Altman, 2005).

State policymakers and practitioners define accommodations in a variety of ways. For the pur- 
poses of this report, we draw from accommodations research to help shape our definitions and 
outlook on testing accommodations for students with disabilities. For example, Thurlow and 
Bolt (2001) defined testing accommodations as: 

changes in assessment materials or procedures that address aspects of students’ 
disabilities that may interfere with the demonstration of their knowledge and 
skills on standardized tests. Accommodations attempt to eliminate barriers to 
meaningful testing, thereby allowing for the participation of students with
disabilities in state and district assessments. (p. 3)

Sireci, Li, and Scarpati (2005) explained the validity of accommodations through an “interaction 
hypothesis” or the theoretical assumption that test accommodations will lead to improved test 
scores for students who need accommodations, but not for students who do not need accom- 
modations (i.e., students with disabilities receive a boost in scores as a result of accommoda- 
tions whereas students without disabilities do not receive a boost or receive a less pronounced 
boost in scores). 

The accommodations allowed in state assessment policies vary from state to state. The most 
common accommodations found in previous syntheses of research were read aloud accommoda- 
tions (sometimes referred to as oral administration), computer administration of tests, extended 
time and tests across multiple days, calculator use, and use of a scribe (Thompson, Blount, & 
Thurlow, 2002). Most students use a combination of accommodations (Bielinski, Ysseldyke,
Bolt, Friedebach, & Friedebach, 2001), as dictated by their IEPs. 

Research on accommodations has yielded mixed results in terms of the validity and efficacy of
providing them to students with disabilities. After several years of study, accommodations research
has yet to produce definitive answers for policymakers and practitioners (Sireci, Li, & Scarpati,
2003). Nevertheless, research reports on accommodations continue to be found in professional
journals, indicating that there is still a need to investigate the complex issues surrounding ac- 
commodations. This report examines empirical research published between 2002 and 2004. 
We searched peer-reviewed articles, technical reports, and dissertations in order to provide the 
readers with up-to-date information on accommodations. 

In this report, research is summarized according to several components, including research pur- 
pose, type of assessment, content area assessed, type of accommodation, number of participants, 
percent of sample consisting of students with disabilities, participant grade level, type of dis- 
ability, research results, research limitations, and recommendations for further research. During 
our review process, we found 49 studies on accommodations published from 2002 through 2004. This
number reflects a high level of research activity in the new millennium, and is slightly more
than the number of studies published between 1999 and 2001 (Thompson, Blount, & Thurlow, 
2002) (see Table 1). The publications reviewed for this report are found in Appendix A. 

Table 1. Number of Accommodations Studies by Years

Years                  Number of Studies
1990 through 1992      11
1993 through 1995      18
1996 through 1998      29
1999 through 2001      46
2002 through 2004      49



Methods 

NCEO used a four-stage process to find publications related to accommodations from 2002 
through 2004. First, we conducted a search of electronic databases including ERIC, PsycINFO,
Educational Abstracts, and Digital Dissertations using the keywords “accommodation,” “test 
adaptation,” “test changes,” “test modifications,” “test accommodations,” “state testing accom- 
modations,” “standards-based testing accommodations,” and “large-scale testing accommoda- 
tions.” 

A second electronic search consisted of organizational Web sites, including Behavior Research
and Training (http://brt.uoregon.edu/), the National Center for Research on Evaluation, Stan- 
dards, and Student Testing (http://www.cse.ucla.edu/), the Center for the Study of Assessment 
Validity and Evaluation (http://www.c-save.umd.edu/index.html), and the Wisconsin Center 
for Educational Research (http://www.wcer.wisc.edu/tesacc/). In addition, an archival search 
of Educational Policy Analysis Archives was undertaken. 

In addition to the electronic searches, NCEO staff performed two hand searches. First, refer- 
ences from all selected materials dated 2002 through 2004 were examined in an effort to find 
further source material. Second, 2002 through 2004 issues of major measurement and special 
education journals were hand searched in the University of Minnesota library. These journals 
included Applied Measurement in Education, British Journal of Special Education, Diagnostique, 
Educational Assessment, Educational and Psychological Measurement, Educational Measure- 
ment: Issues and Practice, Educational Psychologist, Educational Psychology, Exceptional 
Children, Journal of Educational Measurement, Journal of Learning Disabilities, Journal of 
School Psychology, Journal of Special Education, and Remedial and Special Education. Last, the
schedules of the annual conferences of major organizations (such as the American Educational 
Research Association, the Council of Chief State School Officers, Council for Exceptional 
Children, and the National Council on Measurement in Education) were scanned for presenta- 
tions on accommodations. 

NCEO research staff searched all identified sources over an 18-month period beginning in fall, 
2004 and concluding in spring, 2005. All of the studies cited were either empirical research or 
meta-analyses, each with succinct research findings that added to the field’s knowledge about 
the effects of accommodations. The References section of this report includes all journal article, 
research report, conference presentation, and dissertation references. 

Results 

Purpose of Accommodations Research 

Two primary purposes appeared most often in accommodations research from 2002 through 
2004 (see Table 2). First, the majority of studies examined the effect of the use of accommoda- 
tions on scores; 23 studies sought to determine the effect of the use of accommodations on test 
scores of students with disabilities, and 13 studies investigated the effects of accommodations
on test score validity. Second, nine publications examined institutional factors related to
accommodations (such as teacher knowledge, effects of policy, and IEP team decision making).
In addition to these, two publications
examined patterns of errors across items or tests and two meta-analyses synthesized accom- 
modations studies. Details of the studies according to purpose are provided in Appendix A. 

Table 2. Research Purposes

Research Purpose                                                          Number of Studies
Determine the effect of the use of accommodations on test scores
  of students with disabilities                                           23
Investigate the effects of accommodations on test score validity          13
Study institutional factors, teacher judgment, or student
  desirability of accommodation use                                       9
Examine patterns of errors across items or tests                          2
Meta-analysis                                                             2



Authors employed a variety of methods in their research (see Table 3). The most common 
methods were experimental and quasi-experimental research, in which research participants 
took tests under different conditions, and reviews of extant data, in which researchers reviewed 
data from assessments that students took with or without accommodations that were not 
specifically designed for experimental and control conditions. Twenty-one publications used 
experimental design in their methodology and 17 studies reviewed existing data from state 
large-scale assessments or local assessments. A variety of descriptive and comparative statistics 
were employed to examine extant data. In addition to these methods, seven studies
used survey or interview methods to better understand stakeholders' understanding of, or opinions
on, accommodations. Two studies were meta-analyses, one study evaluated a product, and one
study described interventions for IEP teams.

Table 3. Research Methods

Method                                  Number of Studies
Experimental or Quasi-experimental      21
Review of extant data                   17
Survey/Interview                        7
Meta-analysis                           2
IEP intervention                        1
Product evaluation                      1


Note: Studies are described by their primary methodology. 



The experimental and extant data analysis methods combined had two main goals: understanding 
the effect of accommodations on test scores and understanding the effects of accommodations 
on the psychometric qualities of items. Among the 23 studies that examined the effects of ac- 
commodations on test scores (see Table 2), researchers found that computerized administra- 
tion (Pomplun, Frey, & Becker, 2002), read-aloud accommodation (Helwig, Rozek-Tedesco, 
& Tindal, 2002; Meloy, Deville, & Frisbie, 2002), video administration (Burch, 2002; Tindal, 
2002), extended time (Bridgeman, Cline, & Hessinger, 2004), and assistive technology (Landau,
Russell, Gourgey, Erin, & Cowan, 2003; MacArthur & Cavalier, 2004) all had a positive
effect on the test scores of at least some of the students with disabilities included in the research
samples. Conversely, Burch (2002) and Barton (2002) found that some students with learning
disabilities did not benefit from computer or video accommodations. Likewise, Scheuneman,
Camara, Cascallar, Wendler, and Lawrence (2002) found that calculator usage did not have an 
effect on student scores. According to Elliott and Marquart (2003, 2004), extended time also 
did not yield improved scores for students with disabilities. 

In terms of the psychometric properties of accommodations, researchers obtained mixed results
for the score comparability of items. Barton (2002), Barton and Huynh (2003),
Calahan, Mandinach, and Camara (2002), Huynh, Meyer, and Gallant-Taylor (2002), and Kobrin
and Young (2003) found no change in item comparability when various accommodations
(including read aloud, extended time, and computerized administration) were employed. Bolt
and Bielinski (2002), Choi and Tinker (2002), and Thornton, Reese, Pashley, and Dalessandro 
(2002), however, all found that tests administered orally, with extended time, or via computer 
changed item difficulty or constructs. 

Finally, findings related to the institutional issues around accommodations were also mixed. 
Differing foci for studies yielded different results. Six studies of teachers, administrators, and 
students yielded contrasting results on the relative knowledge of school personnel about ac- 
commodations. Cisar (2004) found that special education teachers were more knowledgeable, 
while Gagnon and McLaughlin (2004) found that teachers and administrators scored similarly 
on knowledge measures. Woods (2004) discovered that students do not often predict their need 
for accommodations well. 

Types of Assessment 

Researchers who studied the effects of accommodations used two main types of assessment 
to determine effects and error in the use of accommodations in 2002-2004. This information 
is shown in Table 4. One common approach to testing the effects of accommodations was for 
researchers to use norm-referenced tests. Education professionals typically use norm-referenced 
tests for national comparison, diagnostic decisions in schools, and for college entrance decisions. 
Researchers examined the effects of accommodations on the following norm-referenced tests: 
California Achievement Tests (CAT), Graduate Record Exam (GRE), Law School Admission 
Test (LSAT), Nelson-Denny Reading Test, Scholastic Aptitude Test (SAT), Terra Nova, and the
General Certificate of Secondary Education Examination-United Kingdom (GCSE-UK). 

In addition, researchers employed a number of statewide criterion-referenced tests (including 
tests from Maryland, Missouri, Oregon, and South Carolina). One researcher gathered descrip- 
tive data from a test that was still in the prototype stage. Appendix B provides details of the 
assessments used in the research during the years 2002-2004. 

Table 4. Type of Assessment

Type of Assessment                                              Number of Studies
Norm-referenced and Other Standardized Tests                    18
State Criterion-referenced Tests or Performance Assessments     18
School or District-designed Tests                               0
Other                                                           8
Survey                                                          3
N/A, Meta-Analyses                                              2



Content Areas Assessed 

Authors of studies conducted research in five major academic content areas: reading (n=23), 
mathematics (n=21), science (n=3), writing (n=3), and social studies (n=1) (see Table 5). There
were also seven studies that examined test accommodations, but not in a specific content area. 
The “no specific content area” studies included examinations of test accommodations on general 
academic assessments, but did not include the surveys or meta-analyses (n=5). Appendix C gives
additional details on the content area examined in research studies (11 studies included two or 
more content areas). 

Table 5. Content Areas Assessed*

Content Areas Assessed        Number of Studies
Mathematics                   21
Reading/Language Arts         23
Science                       3
Writing                       3
Social Studies                1
No Specific Content Area      7



*Studies may have reported on multiple content areas.



Type of Accommodation 

We found 15 types of accommodations in the research literature from 2002 through 2004 (see 
Table 6). Four groups of accommodations emerged: presentation (n=21), timing/scheduling 
(n=8), response (n=2), and technological aid (n=2) (see also Appendix D). In addition, 11 of 
the studies investigated the effects of multiple accommodations. 

Presentation accommodations were investigated most frequently in the research from 2002-2004. 
Among publications about presentation accommodations, studies about oral administration of 
tests (read aloud accommodations) were most common (n=11). The use of computers as a testing
accommodation was also common (n=5). In addition, researchers investigated video administration
of tests (n=2), large print accommodations (n=1), dictionary use (n=1), and braille formats
of tests (n=1). Additional studies examined multiple types of accommodations.

Of the eight studies that explored the timing or scheduling of tests, seven studies investigated 
extended time and one study examined the outcomes of testing over multiple days. Of the
two studies about response formats, one examined student-dictated responses and the other
examined the use of calculators. Two studies investigated technological aids.

Eleven studies examined the effects of multiple accommodations. Studies of multiple accom- 
modations included combinations of accommodations such as read aloud, video presentation, 
extra time, large print, individual settings or small group settings. 

Table 6. Type of Accommodation

Type of Accommodation                     Number of Studies
Presentation (21):
    Oral Administration                   11
    Computer Administration               5
    Video                                 2
    Large Print                           1
    Dictionary Use                        1
    Braille                               1
Timing/Scheduling (8):
    Extended Time                         7
    Multiple Day                          1
Response (2):
    Dictated Response                     1
    Calculator                            1
Technological Aid (2)                     2
Multiple Accommodations (11)              11
N/A (Survey or Meta-Analysis) (5)         5



Research Participants 

Studies varied in the number of participants included in the sample. Table 7 shows the number 
of research participants in studies reflected in intervals of 100 participants. Eight high school 
and college students participated in the smallest study (Landau et al., 2003), and Hall (2002) 
used test data from 192,000 students in the largest study. Full details of research participant
numbers are provided in Appendix E. Overall, approximately one-third of the studies had 99 or
fewer participants (n=18); approximately one-third had between 100 and 999 participants (n=13);
and approximately one-third of studies had more than 1,000 research participants (n=16).

Table 7. Number of Participants in Studies

Number of Participants      Number of Studies
1-99                        18
100-199                     2
200-299                     3
300-499                     4
500-999                     4
More than 1000              16
Not Applicable              2



The percentage of research participants with disabilities differed across studies (see Table 8). 
In 18 studies, students with disabilities made up a majority of the sample (participants in eight 
studies were 50-74 percent students with disabilities and the samples in ten studies were 75-100 
percent students with disabilities). In 15 studies, students with disabilities comprised less than 
half of the sample (there were less than 25 percent students with disabilities in 10 studies and 
between 25 and 49 percent students with disabilities in the sample of five studies). Nine studies 
did not report the percentage of the sample with disabilities and seven studies (including two 
meta-analyses) did not use research methods that involved students. 

Table 8. Percent of Sample Consisting of Students with Disabilities

Percent of Sample Consisting of Students with Disabilities      Number of Studies
1-24%                                                           10
25-49%                                                          5
50-74%                                                          8
75-100%                                                         10
Not Reported                                                    9
No Students with Disabilities Participated in Study             0
Not Applicable                                                  7



The grade level of research participants also varied (see Table 9). For example, six studies targeted 
students who were in elementary school (grades K-5), six studies examined accommodations 
with middle school students (grades 6-8), and 11 studies examined accommodations with high 
school students (grades 9-12). Postsecondary students participated in six studies and 15 studies 
investigated students across grade levels (from grades K to 12). Five studies employed surveys 
or were meta-analyses that did not involve actual research participants. 

Table 9. Grade Level of Participants in Studies

Participant Grade Level                                Number of Studies
Elementary (K-5)                                       6
Middle School (6-8)                                    6
High School (9-12)                                     11
Multiple Grade Level Categories (K-Postsecondary)      15
Postsecondary                                          6
Not Applicable                                         5



In addition to differences in grade level, students with a variety of disability labels participated 
in studies (see Table 10). Several studies (n=16) included more than one disability category. 
Fifteen studies included students with learning disabilities, 12 studies included students with 
communication disabilities, 10 studies included students with cognitive disabilities, and 10 
studies included students with emotional/behavioral disabilities. Students with less common 
disabilities, including physical impairment, sensory disabilities, autism, attention deficit dis- 
order, health impairments, and multiple disabilities, were each included in at least one study. 
Twenty-one studies did not report the types of disabilities of participants. 

Table 10. Disability Categories Included in Studies

Type of Disability                                                      Number of Studies
Learning Disability                                                     15
Cognitive Disability (e.g., mental retardation)                         10
Emotional/Behavioral Disability                                         10
Communication Disability                                                12
Reading or Math Deficit                                                 4
Other (includes physical and sensory disabilities, autism, attention
  deficit disorder, health impairments, and multiple disabilities)      16
Not Reported                                                            21
Not Applicable                                                          5



Note: Studies sometimes include students with more than one disability category; all are reflected in this table. 



Research Results 

Results from the 49 studies reviewed in this synthesis varied. Researchers found accommodations 
showed both statistically positive and statistically non-significant effects on scores. Likewise, 
some accommodations had no effect on item comparability, while other types of accommoda- 
tions compromised item comparability. 

Presented here are results according to the type of accommodation used. These results are shown
in Table 11, with detailed results available in Appendix F. Similar to previous reviews of ac- 
commodations research (Thompson, Blount, & Thurlow, 2002), accommodations appeared to 
have mixed effects in studies from 2002 through 2004. Furthermore, even when separated by 
type of accommodation, studies still demonstrated mixed effects. The results from the 11 studies
that investigated multiple accommodations are not synthesized due to lack of study focus
comparability.

Oral Presentation (Read Aloud). A total of 11 studies on oral administration of assessments 
(often called “read aloud” accommodations) produced mixed results. For example, Helwig 
et al. (2002), Huynh, Meyer, and Gallant (2004), Janson (2002), Meloy, Deville, and Frisbie
(2002), Tindal (2002), and Weston (2003) all found that read aloud accommodations had a 
positive effect on scores for students with disabilities. Bolt and Bielinski (2002) and McKevitt 
and Elliott (2003), however, found read aloud accommodations had no significant impact on 
student scores. 

In terms of item comparability, Barton (2002) and Barton and Huynh (2003) found that items 
were comparable, whether presented under standard conditions or with read aloud accommodations.
However, Bolt and Bielinski (2002), Meloy et al. (2002), and Weston (2003) found that
read aloud accommodations did affect item comparability.

The determination of who should use oral presentation accommodations is also an issue. Woods 
(2004) found great inaccuracy in self-prediction for the need of the read aloud accommodation. 
Such disparate findings point to the on-going controversies regarding read-aloud accommoda- 
tions. 

One study examined the impact of a student-reads-aloud (i.e., student reads text but aloud) ac- 
commodation on the performance of middle and high school students with and without learning 
disabilities on a test of reading comprehension. Elbaum, Arguelles, Campbell, and Saleh (2004) 
discovered that students’ test performance did not differ in the two conditions, and students 
with learning disabilities did not benefit more from the accommodation than students without 
learning disabilities. 

Extended Time. Authors of eight studies examined how the use of extended time affected stu- 
dent achievement levels on tests and the extent to which items under extended time or multiple 
day administrations of tests compared to those administered under standard conditions. Several 
studies (Bridgeman et al., 2004; Dempsey, 2004) found that students with disabilities profit 
from extended time accommodations. In these studies, students with disabilities had higher test 
scores because of extended time accommodations. 
Buehler (2002) and Elliott and Marquart (2004), however, found no significant effect on scores
when students were provided extended time. Such disparities in study results again demonstrate 
the lack of consistency in accommodations research. The comparability of specific items under 
different administration categories further complicates accommodations issues. Buehler (2002) 
found varying student results for items that were deemed to be comparable under standard and 
extended time administrations, but Elliott et al. (2004) and Thornton et al. (2002) found that the 
items they studied were not comparable under different administrations. A study by Crawford, 
Helwig, and Tindal (2004) of multi-day testing produced conflicting results. 

Computer Administration. In total, five studies investigated the use of computer administered 
tests from 2002-2004. Among these studies, Pomplun, Frey and Becker (2002) found that stu- 
dents had positive test results when administered tests via computer (rather than paper and pencil 
format). Barton and Huynh (2003), Bridgeman, Lennon and Jackenthal (2003), and Kobrin and 
Young (2003), however, found that computer administration of tests had no statistical effect, or 
a statistically negative impact on student scores. Related studies on item comparability between 
computer and paper/pencil administration yielded similar results, with Choi and Tinker (2002) 
determining that items were changed as a result of format differences. 

Technological Aid. In the years 2002 through 2004, three research studies investigated technol- 
ogy-based testing accommodations, and all resulted in positive effects on test scores. Hansen, 
Lee, and Forer (2002) found that the usability of speech output technology was evaluated posi- 
tively, and that ‘self-voicing’ testing systems have significant potential and may be capable of 
replacing human readers in certain testing situations. Landau et al. (2003) found that the Tactile 
Text Tablet, a hybrid between a braille paper-based test and laptop computer, had positive effects 
on student achievement. MacArthur and Cavalier (2004) found that the use of speech recognition
software was feasible, created impressive dictation results, and improved the quality of
student-written essays.

Calculator Use. Only one study from 2002 through 2004 considered
calculator use. In this study, Scheuneman et al. (2002) found that calculator use had no significant
effect on scores for students taking the SAT. It is unknown why research on calculator usage
has become less common than in past years (there were four research studies on calculator
accommodations in the years 1999-2001 but only one during the years
2002-2004).

Dictionary Use. One international study of dictionary accommodations took place from 2002 
through 2004. Idstein’s (2003) study of Israeli students found that dictionaries were not an ef- 
fective accommodation and that dictionary use interrupted student thought patterns. 

Table 11. Types of Accommodations in Studies

Type of Accommodation                   Research Results                                     Number of Studies
Oral Presentation (11)                  Positive effect on scores                            6
                                        No Differential Item Functioning                     2
                                        No significant effect on scores                      2
                                        Differential Item Functioning                        2
                                        Self-prediction for need unreliable                  1
Extended Time (12)                      Positive effect on scores                            3
                                        No significant effect on scores                      2
                                        Differential Item Functioning                        2
                                        No Differential Item Functioning                     1
                                        Scores on accommodated test predictor of grades      1
Computer Administration (5)             No significant effect on scores                      3
                                        Positive effect on scores                            1
                                        Differential Item Functioning                        1
Technological Aid (3)                   Positive effect on scores                            3
Calculator Use (1)                      No significant effect on scores                      1
Dictionary Use (1)                      Negative effect on scores                            1
Multiple Accommodations                                                                      11
N/A, Meta-analyses, Survey, Teacher                                                          5



Limitations 

Educational research has inherent limitations that require readers to consider findings carefully. 
For example, true experimental conditions rarely mimic the conditions found in schools during
“live” testing. In addition, sample sizes for studies of students with disabilities are
often small because students with disabilities are a minority population in schools (roughly one 
in 10 students has a disability). 

Thirty-six authors (74 percent) of accommodation studies published from 2002 through 2004 
noted limitations in their studies. Table 12 summarizes the most commonly reported
limitations, and Appendix G provides brief annotations of studies that reported limitations.

Fifteen authors reported a small or narrow sample size, including Hall (2002) who noted, “The 
study focuses only on fifth grade students, and the results may not generalize to students with 
disabilities in other grades or dissimilar disabilities, socio-economic statues, etc.” Thirteen 
authors warned that confounding factors may have influenced results. Common factors were 
testing multiple accommodations at once and an inability to randomize the sample. Four authors 
found a flaw in research design that affected study results. Two authors each listed conflicting 
results or nonstandard administration across proctors and schools as a limitation. Eleven authors
did not mention a limitation, nor did the two meta-analyses.

Table 12. Limitations of Research

Research Limitation                                         Number of Studies
Small Sample Size/Sample Too Narrow in Scope                15
Confounding Factors                                         13
Flaw in Research Design                                     4
Conflicting Results                                         2
Nonstandard Administration Across Proctors and Schools      2
No Limitations Mentioned                                    12
Not Applicable/Meta-Analyses                                2



Recommendations for Future Research 

From 2002 through 2004, 34 research studies included recommendations for further research.
Table 13 presents the categories of recommendations listed by researchers, with more detailed
explanations available in Appendix H. Calls for further investigation demonstrate the continuing
investigatory nature of accommodations research. Although scholars have conducted accommodations
research for several decades, there is still a clear need for more examination in various
areas. In studies conducted from 2002 through 2004, authors suggested there was a need for further
understanding of the factors of accommodations that contribute to possible variation in results,
further understanding of student factors that contribute to accommodations use and success,
improved study design or replication of studies, further research on accommodations policy and
overall hypotheses, replication of studies due to small sample sizes, investigation into teacher
characteristics that relate to accommodation selection, investigation into the possible uses for
accommodations in instruction, and investigation into accommodation use practicality.

Table 13. Recommendations for Future Research

Recommendations                                                                 Number of Studies
Investigate characteristics of accommodations themselves in further detail     12
Investigate student factors contributing to accommodations use                 6
Improved study design or study replication                                      5
Study policy and accommodations hypotheses                                      4
Replicate study with larger sample                                              3
Investigate teacher factors related to accommodations selections               2
Investigate possible instructional uses for accommodations                     1
Investigate accommodation practicality                                          1
Discussion and Implications for Future Research



Several themes arose from the 49 studies of accommodations published between 2002 and 2004. 
One theme was that accommodations research is inconclusive. This is similar to past findings 
from NCEO summaries of research (Thompson, Blount, & Thurlow, 2002). Given that there is 
not a preponderance of evidence concerning accommodations, nearly two-thirds of the authors 
(n=31) suggested that future research is needed to solidify understanding of accommodation 
effects. 

Researchers published three more accommodations studies from 2002 through 2004 than from
1999 through 2001. Similar to previous years, the majority of studies in the most recent period
focused on test scores of students with disabilities related to accommodations. A significant number
of studies also investigated the effects of accommodations on test score validity. Researchers
were particularly concerned about accommodations changing the construct of the items assessed. 
Similar to previous reviews, a smaller number of studies from 2002 through 2004 concentrated 
on institutional factors related to accommodation use (e.g., teacher judgment, student selection, 
policy), on patterns of errors across items, or were meta-analyses of previous work. 

The assessments that researchers selected for examination were primarily standardized, norm-referenced
tests and performance assessments. As would be expected with assessment requirements
of the No Child Left Behind Act of 2001, most accommodations research was conducted
in the areas of reading/language arts and mathematics. Across subject areas, researchers studied 
oral administration and timing accommodations most often. 

Sample sizes varied, with approximately equal representation of small (n=99 or fewer subjects),
medium (n=100-999 subjects), and large (n>1,000 subjects) studies. Studies with small sample sizes
typically targeted students with disabilities in K-12 schools, while large studies typically used
tests administered to large numbers of students (such as college entrance examinations). Despite
the even distribution of sample sizes, there was less uniformity in terms of the grade level
of participants. Accommodations studies were spread across grade and educational levels, but
were conducted primarily with secondary and postsecondary education students.

Finally, in terms of demographics, researchers most often studied students with learning 
disabilities from 2002 through 2004. Eight studies explicitly targeted students with learning 
disabilities. This pattern mirrors accommodations patterns from 1999-2001 (Thompson, Blount,
& Thurlow, 2002), most likely because students with learning disabilities are the largest group
of students with disabilities and because this population is frequently assigned accommodations
such as oral administration and extended time. The practical value of studying these accommodations
with students with learning disabilities is obvious, and was evident in research from
2002 through 2004.
Although this report is not meant to provide a scientific meta-analysis of accommodations research
(for meta-analyses see Sireci et al., 2005 and Tindal and Ketterlin-Geller, 2004), general
patterns that emerged from a review of accommodations research in the years 2002, 2003, and 
2004 indicate possible considerations for future research. 

The majority of research concentrated on the effect of accommodations use for students with 
disabilities and the effects on score validity due to accommodations use. Although there were 
36 studies combined that investigated scoring and validity, there was little consensus among 
researchers. Findings continue to be contradictory. Research indicated that accommodations 
were either beneficial or not beneficial for students with disabilities. Likewise, researchers did 
not reach consensus on whether accommodations change the construct of the item assessed. 
Because findings are relatively disparate, there does appear to be a need for further research. 

Research that continues to delineate the “interaction hypothesis” (Sireci et al., 2005) and that
reduces construct-irrelevant variance for students with disabilities without introducing any new
effects for non-disabled students still appears to be necessary. In 2002-2004, 21 studies employed
experimental or quasi-experimental methods. Replications of scientific methods to discover the
effects of accommodations may help the field to better understand how accommodations affect
scoring and validity.

While scientific research holds great importance for scoring and validity issues, the variety of 
research conducted over the course of 2002-2004 is also important. Studies in 2002, 2003, and 
2004 investigated accommodations score effects, validity, teacher decision-making in terms of
accommodations, and accommodations effects for students from kindergarten through postsecondary
education and across five different subject areas. In addition to more studies on these topics, future
research should investigate the positive effects of field-testing potential test items in accommodated
formats in addition to standard formats.

Although the diversity of studies about accommodations presents a challenge to policymakers 
who may wish to have definitive conclusions about accommodations, the breadth of studies 
reflects (at least to some extent) the variety of issues present in education today. Students with 
disabilities are not a homogeneous group. Likewise, one accommodation does not fit all students, 
especially students at different grade or educational levels. 

The move toward more universally accessible assessments provides an opportunity to minimize 
the need for accommodations. Thompson, Johnstone, and Thurlow (2002), however, noted that
flexible, universally designed assessments may only minimize, not completely eliminate, the
need for accommodations. Likewise, until there is an individualized system for validly choosing
accommodations, there is a continued need for research on teacher decision-making related
to accommodations.








Although accommodations research has been a part of educational research for decades, it appears 
that it is still in its nascence. There is still much scientific disagreement on the effects, validity, 
and decision-making surrounding accommodations. Such challenges lead to difficult decisions 
for future accommodations research. Scientific studies with large sample sizes hold promise 
for determining the exact effect that researchers can derive from particular accommodations. 
Likewise, tests that are norm-referenced and statistically defensible (such as standardized tests) 
lead to claims of effects on items that are more significant. 

Unfortunately, much is lost in studies such as those presented above. Students with disabilities
are a heterogeneous group that may require a wide variety of accommodations in order to access
tests. Research from 2002-2004 was most focused on oral administration and extended
time conditions for students with learning disabilities. Research related to issues for students
with learning disabilities is meaningful (given the large numbers of students with learning disabilities),
but research should not excessively focus on the needs of the most populous disability
group. Rather, research on students with a wide variety of disabilities receiving a wide variety
of accommodations is also a valuable focus, even when statistical claims are more difficult to
generate.

As testing technology emerges in the 21st century, further research will need to address flexible
tests that allow on-demand accommodations. Findings from 2002-2004 demonstrate only that 
questions still abound concerning accommodations, and that answers are often circumstantial, 
population-dependent, or constrained to particular tests. Such findings justify the need for tests 
that diminish the need for accommodations, but also more research on accommodations that 
currently exist. As we move forward into the next generation of accommodations research, one 
fact is certain: so long as policies (such as the Individuals with Disabilities Education Act and 
the Americans with Disabilities Act) require that students with disabilities receive accommodations,
there will always be a need for research on how to best, most fairly, and most accurately
assess all students. 
Appendix A — Summary of Research Purpose



Determine the effect of the use of accommodations on test scores of students with 
disabilities 



Bolt, S., & Bielinski, J. (2002). The effects of
the read aloud accommodation on math test 
items. Paper presented at the annual meeting 
of the National Council on Measurement in 
Education, New Orleans, LA. 


The purpose of this study was to examine 
whether score comparability improves when a 
test is read aloud to students. 


Buehler, K. L. (2002). Standardized group
achievement tests and the accommodation of
additional time (Doctoral dissertation, Indiana
State University, 2001). Dissertation Abstracts
International, 63/04, 1312.


This study investigated the effects of additional 
time on test data. 


Burch, M. (2002). Effects of computer-based
test accommodations on the math
problem-solving performance of students with and
without disabilities (Doctoral dissertation,
Vanderbilt University, 2002). Dissertation
Abstracts International, 63/03, 902.


This study investigated three different 
computer-based testing accommodations. 
Students were tested in the following 
conditions: Standard administration 
(SA), computer-read text (CRT), video 
(V), constructed responses (CON), and 
comprehensive accommodations (CA). 


Crawford, L., Helwig, R., & Tindal, G. (2004). 
Writing performance assessments: How 
important is extended time? Journal of 
Learning Disabilities, 37(2), 132-142. 


This study investigated the effects of varying 
the available amounts of testing time on the 
writing performance of students in general and 
special education at Grades 5 and 8. 


Barton, K. E., & Sheinker, A. (2003).
Comparability and accessibility: On line versus 
on paper writing prompt administration and 
scoring across students with various abilities. 
Monterey, CA: CTB-McGraw-Hill. 


The purpose of this study was to examine 
whether students without disabilities obtain 
higher scores than those with disabilities when 
all participants were administered two writing 
prompts (on-line and paper-based)
counterbalanced by mode of administration and
prompt. 


Dempsey, K. M. (2004). The impact of 
additional time on LSAT scores: Does time 
really matter? The efficacy of making decisions 
on a case-by-case basis (Doctoral dissertation, 
La Salle University, 2004). Dissertation 
Abstracts International, 64/10, 5212. 


This study examines the relationship between 
cognitive test data and Law School Admission 
Test (LSAT) performance as well as the 
effects of being granted additional test time 
to take the LSAT. This study also evaluated 
the difference between candidates’ standard 
and accommodated LSAT scores and their 
predicted LSAT scores. 


Elbaum, B., Arguelles, M. E., Campbell, Y., &
Saleh, M. B. (2004). Effects of a
Student-Reads-Aloud Accommodation on the
Performance of Students With and Without
Learning Disabilities on a Test of Reading
Comprehension. Exceptionality, 12(2), 71-87.


This study examined the impact of a
student-reads-aloud (i.e., the student reads the text
aloud) accommodation on the performance
of middle and high school students with and 
without learning disabilities on a test of reading 
comprehension. 
Elliott, S. N., & Marquart, A. M. (2003). Extended
time as an accommodation on a standardized
mathematics test: An investigation of its effects
on scores and perceived consequences for
students with varying mathematical skills.
Madison, WI: University of Wisconsin-Madison,
Wisconsin Center for Education Research.


This study investigated the significance on 
scoring for students who took equivalent forms 
of a standardized math test in two conditions 
(extended time and standard time). 


Elliott, S. N., & Marquart, A. M. (2004). 
Extended time as a testing accommodation: 
Its effects and perceived consequences. 
Exceptional Children, 70(3), 349-367. 


This investigation examined the effect of 
extended time on the performance of students 
with disabilities, students educationally at risk 
in math, and students without disabilities. 


Hansen, E. G., Lee, M. J., & Forer, D. C. (2002).
A ‘self-voicing’ test for individuals with visual 
impairments. Journal of Visual Impairment and 
Blindness, 96(4), 273-275. 


The study investigated the use of speech 
output technology for tests for individuals with 
visual impairments. 


Helwig, R., Rozek-Tedesco, M. A., & Tindal,
G. (2002). An oral versus a standard
administration of a large-scale mathematics
test. The Journal of Special Education, 36(1),
39-47.


The purpose of this study was to examine
whether students perform better on
examinations when items are read aloud via a
video presentation.


Idstein, B. E. (2003). Dictionary use during 
reading comprehension tests: An aid or a 
diversion? (Doctoral dissertation, Indiana 
University of Pennsylvania, 2003). Dissertation 
Abstracts International, 64/02, 483. 


The purpose of this study is to examine 
students’ use of dictionaries during reading 
comprehension exams, especially since their
use during exams has come under critical 
reexamination. 


Janson, I. B. (2002). The effects of testing
accommodations on students’ standardized 
test scores in a northeast Tennessee school 
system (Doctoral dissertation, East Tennessee 
State University, 2002). Dissertation Abstracts 
International, 63/02, 557. 


Scores obtained by students who received 
special education services and did not receive 
accommodations in 1998 and/or 1999 were 
compared to scores obtained by the same 
students who did receive accommodations in 
later testing. Ninety-nine percent of students 
who received accommodations were given the 
read aloud accommodation. 


Kappel, A. (2002). The effects of testing
accommodations on subtypes of students with 
learning disabilities (Doctoral dissertation, 
University of Pittsburgh, 2002). Dissertation 
Abstracts International, 63/05, 1804. 


This study investigated the effects of two 
testing accommodations, extended time 
and oral administration, on the math test 
performance of students with learning
disabilities.


Kettler, R. J., Niebling, B. C., Mroch, A. A.,
Feldman, E. S., & Newell, M. L. (2003). Effects
of testing accommodations on math and
reading scores: An experimental analysis
of the performance of fourth and eighth
grade students with and without disabilities.
Madison, WI: University of Wisconsin-Madison,
Wisconsin Center for Education Research.


Participants with disabilities were assigned 
accommodations based on their IEPs.
Participants without disabilities were paired 
with students with disabilities. Each pair was 
tested under the accommodation condition 
during which both students in the pair received 
the same set of accommodations. 
Landau, S., Russell, Gourgey, K., Erin, J.
N., & Cowan, J. (2003). Use of talking tactile
tablet in mathematics testing. Journal of Visual
Impairment and Blindness, 97(2), 85-96.


This study examined the extent to which use of 
the Talking Tactile Tablet had a positive impact 
on the mathematics performance of students 
who were visually impaired and/or had difficulty 
visualizing graphics and diagrams. To the 
extent possible, the study also explored the 
Talking Tactile Tablet’s impact on the difficulty 
of items. 


MacArthur, C. A., & Cavalier, A. R. (2004).
Dictation and speech recognition technology
as test accommodations. Exceptional Children,
71(1), 43-58.


This study addressed the feasibility and validity 
of dictation using speech recognition software 
(Dragon Naturally Speaking, Version 4) and 
dictation to a scribe as test accommodations 
for students with learning disabilities. 


McKevitt, B. C., & Elliott, S. N. (2003). Effects 
and perceived consequences of using read 
aloud and teacher-recommended testing 
accommodations on a reading achievement 
test. The School Psychology Review, 32(4),
583-600. 


The purpose of this study was to test students’ 
performance on a reading test with and without 
read-aloud accommodations. 


Reed, E. (2002). Wrong for the right reasons: 
Appropriate accommodations for students with 
learning disabilities and/or attention
deficit/hyperactivity disorder (Doctoral dissertation,
Stanford University, 2002). Dissertation 
Abstracts International, 63/10, 3475. 


This study considered student performance 
and the appropriateness of accommodations 
at the level of the individual student through 
a think-aloud process. Students were asked
to think aloud while solving grade-level
mathematics problems.


Tavani, C. M. (2004). The impact of testing 
accommodations on students with learning 
disabilities: An investigation of the 2000 
NAEP mathematics assessment (Doctoral 
dissertation, The Florida State University, 
2004). Dissertation Abstracts International, 
65/07, 2493. 


This study addressed the effects of
accommodations on mathematical performance
scores and examined additional variables
that were shown to have strong relationships with
students’ test performance.


Tindal, G. (2002). Accommodating mathematics
testing using a videotaped, read-aloud 
administration. Washington, DC: Council of 
Chief State School Officers. 


Students participated in both standard and 
videotaped test administrations. During the 
videotaped administration the test items were 
read aloud individually, in a paced format, with 
visual prompting of the answer choices. 


Trammell, J. K. (2003). The impact of 
academic accommodations on final grades in 
a postsecondary setting. Journal of College 
Reading and Learning, 34(1), 76-90. 


The purpose of this study was to determine 
whether postsecondary students with learning 
disabilities and/or Attention Deficit Disorder 
experienced a differential increase in
end-of-term grades when they used academic
accommodations required by the Americans 
with Disabilities Act. Students received 
one or more of the following academic 
accommodations throughout the school year: 
additional time to complete the tests, taping 
classes, testing in a separate room and books 
on tape. 
Weston, T. J. (2003). NAEP validity studies:
The validity of oral accommodation testing. 
Washington, DC: National Center for Education 
Statistics. 


This study examined three factors related
to read aloud accommodation in math. First,
accommodated test scores were compared
to non-accommodated scores for a sample
of students with learning disabilities. Second,
the author examined the relative benefit that
students with learning disabilities received from
read aloud accommodations. Finally, the author examined
the accuracy of information derived from
accommodated and non-accommodated tests.


Investigate the effects of accommodations on test score validity 


Barton, K. E. (2002). Stability of constructs
across groups of students with different 
disabilities on a reading assessment under 
standard and accommodated administrations 
(Doctoral dissertation, University of South 
Carolina, 2001). Dissertation Abstracts 
International, 62/12, 4136. 


The purpose of this study was to examine 
whether a similar construct is measured among 
students who are administered either the oral 
accommodation (OA) form or a regular form of 
an assessment. 


Bridgeman, B., Cline, F., & Hessinger, J.
(2004). Effect of extra time on verbal and
quantitative GRE scores. Applied Measurement
in Education, 17(1), 25-37.


The purpose of this study was to examine the 
effects of extra time on the Graduate Record 
Examination General Test. 


Bridgeman, B., Lennon, M. L., & Jackenthal,
A. (2003). Effects of screen size, screen
resolution and display rate on computer-based
test performance. Applied Measurement in
Education, 16(3), 191-205.


This study evaluated the effects of variations 
in screen size, resolution, and presentation 
delay on verbal and mathematics scores. There 
were three screen display conditions (size and 
resolution) crossed with two presentation rate 
conditions (delay or no delay). 


Cahalan, C., Mandinach, E., & Camara, W. J.
(2002). Predictive validity of SAT I: Reasoning
test for test-takers with learning disabilities and 
extended time accommodations. New York, NY: 
The College Reporting Board. 


The study was conducted to examine the 
predictive validity of scores taken with an 
extended time accommodation. 


Hall, S. E. H. (2002). The impact of test
accommodations on the performance of 
students with disabilities (Doctoral dissertation, 
The George Washington University, 2002). 
Dissertation Abstracts International, 63/03, 902. 


Subjects received a variety of accommodations 
including extended time, dictated response, 
small group, and oral administration of the test. 


Huynh, H., Meyer, J. P., & Gallant-Taylor,
D. (2002). Comparability of scores of
accommodated and non-accommodated 
testings for a high school exit examination 
of mathematics. Paper presented at the 
annual meeting of the National Council on 
Measurement in Education. New Orleans, LA. 


Students received a different form of the test 
that was designed to be appropriate for testing 
students with visual and hearing impairments. 
This form could have been provided in a 
regular print, large-print, or loose-leaf version. 
This form also may have been administered 
orally or by sign language to some of the 
students. 
Huynh, H., Meyer, J. P., & Gallant, D. J.
(2004). Comparability of student performance
between regular and oral administrations
for a high-stakes mathematics test. Applied
Measurement in Education, 17(1), 39-57.


This study examined the effect of oral
administration accommodations on test
structure and student performance on the
mathematics portion of the South Carolina
High School Exit Examination.


Kobrin, J. L., & Young, J. W. (2003).
The cognitive equivalence of reading
comprehension test items via computerized
and paper-and-pencil administration. Applied
Measurement in Education, 16(2), 115-140.


The cognitive equivalence of computerized and 
paper-and-pencil reading comprehension tests 
was investigated. 


Meloy, L. L., Deville, C., & Frisbie, D. (2002). The
effect of a read aloud accommodation on test
scores of students with and without a learning
disability in reading. Remedial and Special
Education, 23(4), 248-255.


Students were randomly assigned to two 
experimental conditions. In one condition the 
test was administered according to standard 
procedures; in the other condition the test was 
read aloud to the students. 


Pomplun, M., Frey, S., & Becker, D. (2002).
The score equivalence of paper-and-pencil 
and computerized versions of a speeded test 
of reading comprehension. Educational and 
Psychological Measurement, 62(2), 337-354. 


Students took two forms of a test in 
computerized and paper-and-pencil versions. 


Scheuneman, J. D., Camara, W. J., Cascallar,
A. S., Wendler, C., & Lawrence, I. (2002).
Calculator access, use, and type in relation
to performance in the SAT I: Reasoning test
in mathematics. Applied Measurement in
Education, 15(1), 95-112.


After completing the test, participants were 
asked to respond to a set of three questions 
about their use of a calculator during the test. 


Thornton, A. E., Reese, L. M., Pashley, P. J., &
Dalessandro, S. P. (2002). Predictive validity of
accommodated LSAT scores. Pennsylvania: 
Law School Admission Council. 


The validity of scores obtained by test 
takers who were administered the test 
under nonstandard time conditions (i.e., 
accommodations that included extended time) 
was investigated. 


Woods, K. (2004). Deciding to provide a reader 
in examinations for the General Certificate 
of Secondary Education (GCSE): Questions 
about validity and inclusion. British Journal of 
Special Education, 31(3), 122-124. 


This study examined the effects of providing a 
read-aloud accommodated test to examinees 
in England, Wales and Northern Ireland. Its 
purpose was to report whether reading age and 
self-prediction were accurate indicators of the 
need for read-aloud accommodations. 


Study institutional factors, teacher judgment, or student desirability of accommodation 
use 


Bolt, S. E. (2004). Using DIF analyses to 
examine several commonly-held beliefs 
about testing accommodations for students 
with disabilities. Paper presented at the 
annual meeting of the National Council on 
Measurement in Education, San Diego, CA. 


The purpose of this study was to examine
the extent of data-based support for several
commonly held opinions about testing
accommodations for students with disabilities.
Bielinski, J., Sheinker, A., & Ysseldyke, J.
(2003). Varied opinions on how to report 
accommodated test scores: Findings based on 
CTB/McGraw-Hill’s framework for classifying 
accommodations (Synthesis Report 49). 
Minneapolis, MN: University of Minnesota, 
National Center on Educational Outcomes. 


A list of 44 different accommodations
categorized into presentation, response,
setting, and timing accommodations was
used.


Cisar, C. A. (2004). Teacher’s knowledge 
about accommodations and modifications 
as they relate to assessment (Doctoral 
dissertation, Loyola University of Chicago, 
2004). Dissertation Abstracts International, 
65/10, 3754. 


The purpose of this study was to determine 
if there was a difference among staff 
(administrators, general education teachers, 
special education teachers, elective area 
teachers: art, PE, music) servicing students 
with special needs in their ability to distinguish 
accommodations from modifications and their 
ability to use them in assessment activities. 


Gagnon, J. C., & McLaughlin, M. J. (2004). 
Curriculum, assessment, and accountability 
in day treatment and residential schools. 
Exceptional Children, 70(3), 263-283. 


This study determined school-level curricular, 
assessment, and accountability policies and 
practices in private and public day treatment 
and residential schools for elementary age 
children with emotional or behavioral disorders. 


Helwig, R., & Tindal, G. (2003). An 
experimental analysis of accommodation 
decisions on large-scale mathematics tests. 
Exceptional Children, 69(2), 211-225. 


This study tested the accuracy with which 
special education teachers determine which 
students need read-aloud accommodations. 
An additional goal of this study was to develop 
a profile of students who benefit from this 
type of accommodation by contrasting their 
achievement levels in reading and basic math 
skills. 


Jackson, L. M. (2003). The effects of testing 
adaptations on students’ standardized test 
scores for students with visual impairments 
in Arizona (Doctoral dissertation, University 
of Arizona, 2003). Dissertation Abstracts 
International, 64/10, 3644. 


The purpose of this study was to determine 
the relationship of testing modifications, a type 
of adaptation, and the effects of demographic 
information on students’ standardized test 
scores for students in Arizona who have visual 
impairments including those with additional 
disabilities. 


Katzman, L. I. (2004). Students with disabilities
and high stakes testing: What can the students 
tell us? (Doctoral dissertation, Harvard 
University, 2004). Dissertation Abstracts 
International, 65/05, 1732. 


This study examined the qualitative aspects of 
high stakes testing and accommodations for 
students with disabilities by asking students to 
explain their understanding and experiences 
of participating in a large-scale high school 
examination. 


Nickerson, B. (2004). English language 
learners, the Stanford Achievement Test, and 
perceptions regarding the effectiveness of 
testing accommodations: A study of eighth 
graders (Doctoral dissertation, The George 
Washington University, 2004). Dissertation 
Abstracts International, 65/07, 2465. 


This study examined the perception of students 
with regard to the effectiveness of testing 
accommodations in assisting them to more 
accurately demonstrate content knowledge and 
skills. 
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Shriner, J. G., & Destefano, L. (2003). 
Participation and accommodation in state 
assessment: The role of individualized 
education programs. Exceptional Children, 
69(2), 147-161. 


The purpose of this study was to test whether
training sessions help special education
teachers and administrators use and report
accommodations on test day.


Examine patterns of errors across items or tests 


Barton, K. E., & Huynh, H. (2003). Patterns of 
errors made by students with disabilities on a 
reading test with oral reading administration. 
Educational and Psychological Measurement, 
63(4), 602-614.


This study examined differences in the types of 
errors made by students with disabilities on a 
multiple choice reading test administered under 
oral reading accommodations. 


Choi, S. W., & Tinker, T. (2002). Evaluating
comparability of paper-and-pencil and
computer-based assessment in a K-12 setting.
Paper presented at the annual meeting of 
the National Council on Measurement in 
Education, New Orleans, LA. 


Students took alternate forms of the test under 
computer administration and paper-and-pencil 
administration. 


Meta-analysis 


Sireci, S. G., Li, S., & Scarpati, S. (2003).
The effects of test accommodations on
test performance: A review of the literature
(Research Report 485). Amherst, MA: Center
for Educational Assessment.


The purpose of this study was to analyze 
existing research in the area of test 
accommodations. 


Tindal, G., & Ketterlin-Geller, L. R.
(2004). Research on mathematics test
accommodations relevant to NAEP testing.
Washington, DC: National Assessment
Governing Board.


The purpose of this study was to synthesize 
research pertaining to differential item 
functioning. 
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Appendix B — Summary of Type of Assessment 



Author 


Norm-Referenced 
and Other 
Standardized 
Tests 


State Criterion-
Referenced
Tests or
Performance
Assessments


School or 
District- 
designed Tests 


Other 


Barton (2002) 




The reading 
portion of a 
secondary 
level statewide 
assessment 
was used as 
the dependent 
variable. 






Barton (2003) 




The study 
was based on 
statewide data 
from the 1996, 
1997, and 1998 
administration 
of the Oral 
Accommodation 
form of South 
Carolina’s 
statewide 
reading test. 






Barton (2003) 








Questionnaires 
addressing 
computer literacy, 
accommodations, 
and accessibility 
were given to 
all students 
participating. 
Writing prompts 
were scored 
via paper and 
online formats 
and analyzed 
via descriptive 
and inferential 
analysis. 
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Bielinski (2003) 








This study is a 
survey of the 
perceptions 
held by people 
familiar with 
policy or 
research on the 
way in which 
test scores are 
influenced by 
accommodations 
and how scores 
obtained under 
accommodated 
conditions are 
to be treated 
in reporting. 
Participants 
marked each 
accommodation 
as either: 1) 
measuring the 
construct in the 
same way, 2) 
changing the 
meaning of the 
test score, or 
3) not having 
definitive 
evidence to 
place it in either 
category one or 
two. 


Bolt (2002) 




Level analyses 
were conducted 
on multiple 
choice math 
items from 
the Missouri 
Assessment 
Program. 






Bolt (2004) 




A series of DIF
analyses were
conducted
across three
statewide
achievement
tests.
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Bridgeman 

(2003) 


The participants 
were tested using 
a computerized 
version of 
questions from the 
SAT I: Reasoning
Test. The test was 
given in various 
formats crossing 
screen size, 
resolution, and 
presentation rate. 








Bridgeman 

(2004) 


The verbal and 
quantitative 
sections of the 
Graduate Record 
Examination were 
experimentally 
administered with 
standard time 
limits and at 1.5
times the normal 
allotted time. 








Buehler (2002) 


Students were 
administered the 
reading subtests 
of the California 
Achievement 
Tests, Fifth Edition 
(CAT/5) and the 
rapid-naming 
subtests of the 
Comprehensive 
Test of 
Phonological 
Processing 
(CTOPP). 








Burch (2002) 








Several math 
tests were 
administered to 
assess the math 
problem-solving 
performance of 
students with 
and without 
disabilities. 
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Cahalan (2002) 


SAT I test scores
and self-reported
high school grade
point average
(HSGPA) were used to
predict first year
grade point
average (FGPA).








Choi (2002) 




Items from a 
statewide math 
and reading test 
were analyzed. 
Surveys were 
also conducted 
to determine 
student computer 
experience. 






Cisar (2004) 








The researcher 
created a 
questionnaire 
that contained 
four sections of 
data collection: 
professional 
development, 
identification, 
accommodation 
information, and 
demographics. 


Crawford (2004) 




The students 
completed a 30- 
minute Oregon 
state writing 
performance 
assessment 
as well as a 
longer writing 
performance 
assessment 
which was 
completed 
over 3 days. 
Assessments 
were evaluated 
on four 
traits (ideas, 
organization, 
conventions, 
and sentence 
fluency). 
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Dempsey (2003) 


Subjects were 
administered 
the LSAT under 
extended time 
conditions. 








Elbaum (2004) 




A test made 
up of 3rd-5th
grade reading 
passages. 






Elliott (2003) 


Students 
completed 
alternate 
short forms of 
standardized 
mathematics tests 
developed from 
the TerraNova 
Level 18
mathematics test.








Elliott (2004) 


The students 
completed one 
of two alternate 
short forms of 
standardized 
mathematics 
tests developed 
from the 
TerraNova level 
18 mathematics 
test. Upon 
completion of 
the test, students 
completed an 
accommodations 
survey about 
their reactions to 
working on the 
test under the 
accommodated 
or standard 
conditions. 
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Gagnon (2004) 








Two surveys 
were developed 
based on 
a review of 
literature, 
consideration 
of current 
educational 
reform, etc. 
There were five 
sections on each 
survey; however,
this study 
focuses only 
on the sections 
on curricular 
policies and 
accountabilities. 


Hall (2002) 




Results from 
the 2000 
administration 
of the Maryland 
School 
Performance 
Assessment 
Program (MSPAP)
were used as 
the dependent 
variable in 
this post hoc 
analysis. 






Hansen (2002) 








The study 
examined 
the use of 
a prototype 
testing system 
that utilizes 
synthesized 
speech to 
deliver questions 
on reading 
and listening 
comprehension 
tests. 
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Helwig (2002) 




Items were 
selected from 
a statewide 
multiple-choice 
math test; items 
considered more 
difficult to read 
were specifically 
analyzed. 






Helwig (2003) 








For each student, 
the appropriate 
teacher 
completed a 
survey that rated 
the student’s 
skill level in both 
reading and 
mathematics 
on a 5-point 
Likert scale. The 
teacher also 
predicted which 
students would 
benefit most from 
a read-aloud 
accommodation. 
The students 
were also 
tested with a 
standardized 
reading and 
basic math skills 
test. 


Huynh (2002) 




Performance 
on the math 
portion of the 
South Carolina 
High School Exit 
Examination 
was used as 
the dependent 
variable. 
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Huynh (2004) 




The mathematics 
section of 
the state exit 
examination 
was given at 
grade 10. All
students in this 
assessment 
program were 
given unlimited 
time to complete 
the test. 






Idstein (2003) 




The students 
were given a 
test called a 
bagrut, a reading 
comprehension 
test in English. 
The reading 
task component 
included 
two reading 
passages. 
Students were 
permitted use 
of the Oxford 
Student’s 
Dictionary 
for Hebrew 
Speakers. 






Jackson (2003) 




The dependent
variable in
this study was
the Stanford
Achievement
Test, 9th Edition.






Janson (2002) 




Performance on 
the Tennessee 
Comprehensive 
Assessment 
Program (TCAP) 
achievement 
test was used as 
the dependent 
variable. 

Results from the 
following years 
were analyzed: 
1998, 1999, 
2000, and 2001. 
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Kappel (2002) 




All students were 
administered 
items from the 
Mathematics 
subtest of 
the California 
Achievement 
Test under 
several
conditions, with
and without 
accommodations. 






Katzman (2004) 








The students 
were interviewed 
after completing 
the 10th grade
MCAS, an 
assessment 
students must 
pass in order to 
graduate. 


Kettler (2003) 


Two math 
subtests and two 
reading subtests 
from research 
editions of the 
TerraNova Multiple 
Assessment 
Battery were 
used to assess 
participants’ 
achievement 
levels. 








Kobrin (2003) 


Subjects 
were tested 
using reading 
comprehension 
items from 
the ETS GRE
General Test 
Big Book. Two 
long passages 
consisting of 
55 lines and 7 
corresponding 
test items were 
selected and 
administered via 
computer and 
paper-and-pencil
formats. 
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Landau (2003) 








For this 
study, three 
mathematics 
test forms, each 
containing four 
items, were 
administered to 
the participants. 
Each of the 12 
items referenced 
a diagram 
or graphical 
element. The 
items focused 
on geometry, 
measurement, 
patterns and 
relations, and 
statistics and 
probability. 


Macarthur 

(2004) 


Two measures 
were used to 
evaluate accuracy 
of speech 
recognition: 
sentence probes 
and word-list 
probes. Students 
wrote essays 
under the following 
three conditions: 
using handwriting, 
using a scribe, 
and using speech 
recognition 
software. 
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McKevitt (2003) 


Two forms of 
a research 
version of the 
TerraNova Multiple 
Assessments 
Reading test 
(eighth-grade 
level) were used 
in this study. After 
completing the 
test, students 
completed 
a survey 
about testing 
accommodations. 
Teachers also 
completed a 
survey about their 
perceptions of 
the effectiveness 
of testing with 
accommodations. 








Meloy (2002) 




Participants were 
administered 
four tests from 
the Iowa Tests 
of Basic Skills 
(ITBS): Science,
Usage and
Expression,
Math Problem-Solving and Data
Interpretation,
and Reading
Comprehension.






Nickerson 

(2004) 








Interviews 
were used to 
elicit student 
perceptions 
about which 
testing
accommodations
effectively
assisted the
students in
demonstrating
what they
know and can
do on the SAT-9 
test. 
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Pomplun (2002) 


Participants were 
administered 
multiple forms of a 
reading placement 
test, namely the 
Nelson-Denny 
Reading Test. 








Reed (2002) 


Three subtests of 
the Woodcock- 
Johnson Revised 
Tests and a subset 
of tasks from 
the Wechsler 
Intelligence Scale 
for Children 
were used in 
this study. The 
think-aloud data 
were analyzed in 
conjunction with 
pre-test ability 
and achievement 
measures to 
determine why 
students got 
test items right 
or wrong. Test 
items were 
thought to function 
appropriately for 
students who got 
the answer wrong 
and displayed no 
mastery of the 
construct being 
assessed during 
the think-aloud 
process. 
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Scheuneman 

(2002) 


The participants 
were administered 
the SAT I:
Reasoning test 
in Mathematics 
in domestic test 
centers. Questions 
about use of the 
calculator on the 
test were placed in 
the answer sheets 
for the November 
1996 and the 
November 1997 
administrations of 
the examination. 








Shriner (2003) 




The IEP analyses 
that are reported 
were conducted 
twice (1999, 
2000). During 
the intervening 
year, site-based 
management 
teams of special 
and general 
education 
teachers and 
administrators 
participated 
in a series of 
training sessions 
and follow-up 
conducted by 
the researchers 
during March 
1999 through 
February 2000. 
The decisions of 
the same groups 
of trained IEP 
team members 
were followed 
through both 
years in a 
longitudinal 
design. 













Tavani (2004) 


The 2000 NAEP
Mathematics
assessment.








Thornton (2002) 


The measure used 
to assess the 
predictive validity 
of the LSAT for 
participant groups 
was law school 
first year average 
grades. 








Tindal (2002) 




Fourth and 
seventh grade 
levels of a 
multiple choice 
mathematics 
test were 
administered 
in two
different forms
(videotaped and 
standard) in a 
counterbalanced 
order. 






Trammell (2003) 








End of term 
grades for each 
subject were 
compared and 
contrasted in this 
study. 


Weston (2003) 


All subjects took 
two matched forms 
of a mathematics 
assessment 
based on NAEP 
items: one form 
accommodated 
(read-aloud), and 
one form non- 
accommodated. 

All students also 
took the first 
part of the Third 
Grade TerraNova 
Reading test to 
determine reading 
level. 
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Woods (2004) 



The GCSE is 
a high stakes 
examination in 
England, Wales 
and Northern 
Ireland. It is used 
as a predictor of 
future educational 
achievement. 
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Appendix C— Subject Area Studied (by Author) 



Author 


Math 


Reading/ 

Language 

Arts 


Science 


Writing 


Social 

Studies 


No 

Specific 

Content 

Area 


Total 


Barton (2002) 




X 










1 


Barton (2003) 




X 










1 


Barton (2003) 








X 






1 


Bielinski (2003) 














N/A 


Bolt (2002) 




X 










1 


Bolt (2004) 


X 


X 










2 


Bridgeman (2003) 


X 


X 










2 


Bridgeman (2004) 


X 


X 










2 


Buehler (2002) 




X 










1 


Burch (2002) 


X 


X 










2 


Cahalan (2002) 












X 


1 


Choi (2002) 


X 


X 










2 


Cisar (2004) 














N/A 


Crawford (2004) 








X 






1 


Dempsey (2003) 












X 


1 


Elbaum (2004) 




X 










1 


Elliott (2003) 


X 












1 


Elliott (2004) 


X 












1 


Gagnon (2004) 














N/A 


Hall (2002) 




X 


X 








2 


Hansen (2002) 




X 










1 


Helwig (2002) 




X 










1 


Helwig (2003) 


X 


X 










2 


Huynh (2002) 


X 












1 


Huynh (2004) 








X 






1 


Idstein (2003) 




X 










1 


Jackson (2003) 


X 


X 










2 


Janson (2002) 


X 




X 




X 




3 


Kappel (2002) 


X 












1 


Katzman (2004) 














N/A 


Kettler (2003) 


X 


X 










2 


Kobrin (2003) 




X 










1 


Landau (2003) 


X 












1 


Macarthur (2004) 


X 












1 


McKevitt (2003) 




X 










1 
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Meloy (2002) 


X 


X 


X 








3 


Nickerson (2004) 














N/A 


Pomplun (2002) 




X 










1 


Reed (2002) 


X 


X 










2 


Scheuneman 

(2002) 


X 












1 


Shriner (2003) 












X 


1 


Tavani (2004) 


X 












1 


Thornton (2002) 












X 


1 


Tindal (2002) 


X 












1 


Trammell (2003) 












X 


1 


Weston (2003) 


X 


X 










2 


Woods (2004) 












X 


1 










Appendix D — Type of Accommodation Studied (by Author) 








Other 


























N/A 






X 






N/A 


















Multiple 








X 




X 








X 




















X 






X 


X 






Techno- 

logical 

Aid 


Video/ 

Techno. 

Aid 










































X 












Timing 


Multiple 

Day 




























X 


























Extended 

Time 
















X 


X 




X 








X 




X 


X 


















Setting 


Individual/ 

Small 

Group 

Setting 






















































Response 


Calculator 






















































Word 

Processor 






















































Dictated 

Response 






















































Presentation 


Large 

Print 






















































Dictionary 

Use 




















































X 


Computer 

Administration 






X 








X 










X 






























Oral 

Presentation 


X 


X 






X 


































X 






X 






Author 


Barton (2002) 


Barton (2003) 


Barton (2003) 


Bielinski (2003) 


Bolt (2002) 


Bolt (2004) 


Bridgeman (2003) 


Bridgeman (2004) 


Buehler (2002) 


Burch (2002) 


Cahalan (2002) 


Choi (2002) 


Cisar (2004) 


Crawford (2004) 


Dempsey (2003) 


Elbaum (2004) 


Elliott (2003) 


Elliott (2004) 


Gagnon (2004) 


Hall (2002) 


Hansen (2002) 


Helwig (2002) 


Helwig (2003) 


Huynh (2002) 


Huynh (2004) 


Idstein (2003) 
















N/A 






















X 














i n 






X 




X 
















X 






X 






X 






- 














X 






























C\J 












































- 


































X 






















































o 




























X 
















- 












































o 
















X 




























- 


X 










































- 












































- 












X 












X 




















in 




X 














X 


X 
















X 




X 


X 


- 


Jackson (2003) 


Janson (2002) 


Kappel (2002) 


Katzman (2004) 


Kettler (2003) 


Kobrin (2003) 


Landau (2003) 


Macarthur (2004) 


McKevitt (2003) 


Meloy (2002) 


Nickerson (2004) 


Pomplun (2002) 


Reed (2002) 


Scheuneman (2002) 


Shriner (2003) 


Tavani (2004) 


Thornton (2002) 


Tindal (2002) 


Trammell (2003) 


Weston (2003) 


Woods (2004) 

















Appendix E — Summary of Participants 



Author 


Number of Study 
Participants and Percent 
with Disabilities 


Grade-Level of 
Participants 


Types of Disabilities of 
Students Included in the 
Sample (as labeled by 
authors) 


Barton (2002) 


5,921 (28% students with 
disabilities) 


10th, 12th Grades


Learning disability, emotional 
disability, mental retardation, 
speech, language, vision or 
hearing impairment, physical 
disability 


Barton (2003) 


2,924 (80% students with
disabilities)


12th Grade


Learning disability, mentally 
challenged, emotional 
disability, physical disability, 
communication disability 


Barton (2003) 


630 (50% students with 
disabilities) 


4th-6th Grades


Emotionally disturbed, 
learning disabled, physically 
disabled, speech/language, 
hearing impaired 


Bielinski (2003) 


86 (% N/A) 


State assessment 
directors, state special 
education directors, or 
individuals who have 
printed research on 
test accommodations 
or have published 
accommodations 
research. 


N/A 


Bolt (2004) 


More than 1,000 (Number of
students with disabilities not 
reported) 


Elementary School, 
High School 


Not Reported 


Bolt (2002)


3,013 (67% students with 
disabilities) 


4th Grade




Bridgeman (2003) 


357 (Number of students with 
disabilities not reported) 


11th Grade


Not Reported 


Bridgeman (2004) 


7,653 


Post secondary students 


Not reported 


Buehler (2002) 


49 (45% students with 
disabilities) 


K-5th Grades


Not reported 


Burch (2002) 


49 (67% students with 
disabilities) 


4th Grade


Reading deficit 


Cahalan (2002) 


34,000 (Number of students 
with disabilities not reported) 


College students 


Not reported 


Choi (2002) 


1,600 (Number of students with
disabilities not reported) 


3rd, 10th Grades


Not reported 


Cisar (2004) 


505 (% N/A) 


School staff members 


N/A 


Crawford (2004) 


353 (14% students with 
disabilities) 


5th-8th Grades


Not reported 
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Dempsey (2003) 


200 


Post collegiate adults 


Participants reported either 
attention deficit or a learning 
disability limiting performance 
under standard test conditions 


Elbaum (2004) 


311 (74% students with 
disabilities) 


6th-10th grade students


Not Reported 


Elliott (2003) 


69 (33% students with 
disabilities) 


8th Grade


Mild learning disabilities, 
emotional disabilities, 
behavioral disabilities, mild 
physical disabilities, speech 
and language disabilities, mild 
cognitive disabilities 


Elliott (2004) 


97 (24% students with 
disabilities) 


8th Grade


Mild learning disabilities, 
emotional disabilities, 
behavioral disabilities, mild 
physical disabilities, speech 
and language disabilities, mild 
cognitive disabilities 


Gagnon (2004) 


500 (% N/A) 


Principals and teachers 


N/A 


Hall (2002) 


192,000 (6% students with
disabilities) 


5th Grade


Not reported 


Hansen (2002) 


17 (100% students with 
disabilities) 


Ages 17 to 55


Legally blind 


Helwig (2002) 


1,343 (20% students with
disabilities) 


4th, 5th, 7th, and 8th
Grades


Reading 


Helwig (2003) 


1,218 (20% students with 
disabilities) 


4th-8th Grades


Learning disability, language 
impairment, serious emotional 
disturbance, mental 
retardation 


Huynh (2002) 


90,000 (8% students with 
disabilities) 


10th Grade


Speech, hearing, visual, 
orthopedic, emotional, learning 
disabilities, educable mentally 
retarded, and trainable 
mentally retarded 


Huynh (2004) 


89,214 (4% students with
disabilities)


10th Grade


Speech, hearing, visual, 
orthopedic, emotional, learning 
disabilities, educable mental 
retardation, and trainable 
mental retardation 


Idstein (2003) 


63 (Number of students with 
disabilities not reported) 


11th Grade


Not reported 
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Jackson (2003) 


71 (100% students with 
disabilities) 


2nd-9th Grades


Visual impairments, including 
students with additional 
disabilities. The students 
in this study attended either 
a specialized school for 
the visually impaired or a 
public school with support 
from teachers of the visually 
impaired 


Janson (2002) 


448 (100% students with
disabilities)


2nd-8th Grades


Twelve disability groups 
represented 


Kappel (2002) 


47 (100% students with
disabilities)


5th grade


Students with learning 
disabilities were categorized 
into one of three groups based 
on patterns of performance 
on a large-scale achievement 
test, which was administered 
to all participants. A fourth 
group of students without 
disabilities was also included 
in the sample. 


Katzman (2004) 


36 (67% students with 
disabilities) 


10th Grade


Not Reported 


Kettler (2003) 


196 (44% students with
disabilities)


4th, 8th Grades


Not reported 


Kobrin (2003) 


48 (Number of students with 
disabilities not reported) 


College students 


Not reported 


Landau (2003) 


8 (100% students with
disabilities)


9th Grade-College
students 


Visual impairments resulting in 
a need for braille 


Macarthur (2004) 


31 (68% students with
disabilities) 


High school students 


Not reported 


McKevitt (2003) 


79 (51% students with 
disabilities) 


8th Grade


Not reported 


Meloy (2002) 


260 (24% students with 
disabilities) 


6th-8th grades


Reading deficit 


Nickerson (2004) 


30 (Number of students with 
disabilities not reported) 


8th Grade


Not Reported 


Pomplun (2002) 


215 (Number of students with 
disabilities not reported) 


High school, post- 
secondary 


Not reported 


Reed (2002) 


36 (78% students with 
disabilities) 


8th Grade


Learning disability, attention 
disability 


Scheuneman (2002) 


417,000 (Number of students 
with disabilities not reported) 


11th, 12th Grades


Not reported 


Shriner (2003) 


651 (92% students with 
disabilities) 


3rd-11th Grades


Learning disability, behavior 
disorder, mental retardation, 
speech/language, orthopedic 
impairment, visual impairment, 
hearing impairment, autism, 
other health impairment 
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Sireci (2003) 


N/A 


N/A 


N/A 


Tavani (2004) 


42,453 (5% students with 
disabilities) 


4th Grade, 8th Grade, 12th
Grade


Not Reported 


Thornton (2002) 


123,065 (1% students with 
disabilities) 


Law school students 


Attention deficit, learning 
disability, neurological 
impairment, and visual 
impairment subgroups. 


Tindal (2002) 


2,000 (40% students with 
disabilities) 


4th, 5th, 7th, and 8th
Grades


Mental, speech, orthopedic, 
traumatic, learning disability, 
hearing, visual, autism 


Tindal (2004) 


N/A 


N/A 


N/A 


Trammell (2003) 


61 (100% students with 
disabilities) 


Undergraduate college 
students 


Learning disability, attention 
deficit 


Weston (2003) 


119 (54% students with 
disabilities) 


4th Grade


Not reported 


Woods (2004) 


38 (Number of students with 
disabilities not reported) 


High school students 


Not reported 
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Appendix F — Summary of Research Results 



Author 




Barton (2002) 


The results indicate that a similar construct was measured among students 
with and without disabilities taking the regular form. The results also indicate 
that a similar construct was measured among students with and without 
disabilities taking the oral accommodation form. 


Barton (2003) 


The study indicates that when errors are used as an extra factor in exploring 
the nature of proficiency, the reading construct varies only slightly across 
disability groups. The results indicate that it is safe to apply the same 
meaning to test scores for these groups even when the test is administered 
under different accommodations. 


Barton (2003) 


Results indicated that students without disabilities obtain higher scores than 
those with disabilities. There were significant differences between essays 
scored online and by hand scorers; however, there were no differences 
between students’ performance online or on paper. 


Bielinski (2003) 


The results show that the extent of agreement about how accommodated 
scores should be treated depends on the accommodation. The study also 
shows how deep-seated beliefs lead some respondents to consider almost 
no accommodation as changing the construct, whereas other respondents 
consider almost all accommodations as influencing the construct being 
measured. 


Bolt (2002) 


The read aloud accommodation did not appear to improve score 
comparability for students with reading disabilities when compared to students 
without disabilities. More items displaying differential item functioning (DIF)
were identified for those who received the accommodation than for those who 
did not receive the accommodation. 


Bolt (2004) 


Results provide some support for the commonly held beliefs, although results 
were not always consistent across datasets. The results also point to the 
challenge of appropriately assessing the skills and knowledge of students 
with disabilities using currently available assessments. 


Bridgeman (2003) 


Screen display conditions and presentation rate had no significant effect on 
math scores. Verbal scores were a quarter of a standard deviation higher with 
the larger, highest resolution display. 


Bridgeman (2004) 


Extra time added about 7 points to verbal scores and 7 points to quantitative 
scores. The accommodation appeared to have a greater impact on the 
quantitative scores of lower ability examinees. 


Buehler (2002) 


Results indicated that students with learning disabilities did not use
significantly more time on the CAT/5, even when given the option. Students
with disabilities did not receive any differential benefit from the use of the
additional time accommodation. Although there were no differences in the
reliability of the CAT/5 due to the accommodation of additional time, the
validity of the CAT/5 was lower for students with learning disabilities who
received additional time. The CTOPP was not found to be a useful predictor
of students who would benefit from additional time on the CAT/5.
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Burch (2002) 


In comparison to students without LD, students with both reading and 
math disabilities experienced large accommodation boosts in the following 
conditions: CRT, V, and CA. Students with only reading disabilities did not 
receive an accommodation boost larger than students without LD under any 
condition. 


Cahalan (2002) 


In general, the revised SAT was found to be positively correlated with FGPA for
students who took the test with extended time accommodations for a learning 
disability. SAT scores were fairly accurate predictors of FGPA for students 
with learning disabilities. In the majority of cases when HSGPA was used 
along with SAT test scores, the predictive validity of FGPA was increased. 


Choi (2002) 


Item difficulty estimates did not appear to be the same across modes, 
particularly on the reading test and at the third grade level. When comparing 
identical items that were administered across both modes, computer items 
tended to have higher item difficulty estimates. Scrolling reading passages 
on computer screens seemed to interfere with students’ test-taking
behavior, particularly among younger students.


Cisar (2004) 


Special educators and administrators tended to score higher than general 
and elective area teachers in their ability to distinguish and use assessment 
modifications. 


Crawford (2004) 


A significant interaction was found at grade 5 between length of time allotted 
for the assessment and the students’ education classification. Grade 5 
students performed significantly better on the 3-day writing assessment, 
with students in special education benefiting the most. The eighth-graders 
performed no better on the 3-day assessment than in the 30-minute 
assessment. Significant differences were reported across certain writing traits. 


Dempsey (2003) 


The verbal comprehension index was identified as the score that most closely 
predicts LSAT performance. This study found that scores earned under 
accommodated conditions are better than those earned under standard 
conditions. 


Elbaum (2004) 


As a group, students’ test performance did not differ in the two conditions, 
and students with learning disabilities did not benefit more from the 
accommodation than students without learning disabilities. However, 
students with learning disabilities showed greater variability in their response 
to the accommodation. 


Elliott (2003) 


The performance of students with disabilities was highly similar to 
the performance of students without disabilities under standard time 
and extended time testing conditions. Overall, the provision of the 
accommodation, extended time, did not significantly improve scores of 
students with disabilities on the math test. 


Elliott (2004) 


The scores achieved in the extended time condition were higher than the 
scores achieved in the standard condition for all groups. However, the scores 
of students with disabilities did not improve significantly more than those of 
the students without disabilities when given extra time. A large proportion 
of survey respondents across all three groups expressed approval of the 
extended time condition. 


Gagnon (2004) 


No significant differences existed between teacher and principal reports of 
school-level curricular, assessment, and accountability policies. However, 
several statistically significant differences existed in school policies for 
schools that served students from a single district and those that served 
students from across a single state or more than one state. Approximately 
two-thirds of all the schools administered district and state assessments and 
most schools used their state’s accommodation guidelines. 


Hall (2002) 


The study found that nearly 75% of fifth grade students with disabilities who 
participated in the MSPAP 2000 received test accommodations. Nearly half 
of these students received reading accommodations that invalidated the 
construct of the reading test, and almost a third of these students received 
writing accommodations that invalidated the language usage test. These 
reading and writing accommodations resulted in the reading and language 
usage scores of thousands of students with disabilities not being reported. 
Seventy-five percent of students with disabilities received accommodations 
and a third of these students met the satisfactory standard in the subject 
areas assessed. Also, although 25% of students with disabilities did 
not receive accommodations, about one-third of these students met the 
satisfactory standard. 


Helwig (2002) 


Elementary students with disabilities tended to perform better under the 
read aloud condition; elementary general education students did not appear 
to receive a similar benefit from the accommodation. For middle school 
students, no significant interactions were found. 


Helwig (2003) 


The teachers in the study were not effective in their recommendations 
of which students would, and would not, benefit from having math test 
items read aloud. Teachers’ ratings of their students’ needs for testing 
accommodations coincided with actual student performance only half the 
time. The study found no connection between performance on reading and 
basic math skills tests and the need for accommodations. 


Huynh (2002) 


Accommodations provided on the separate form did not appear to 
substantially change the internal test structure. Students with disabilities 
taking the regular test form did not perform as well as other groups; students 
with disabilities taking the accommodated form performed as well as students 
without disabilities taking the regular form. 


Huynh (2004) 


It was found that the test structure remained rather stable across the three 
groups. Controlling for student background variables, disabled students 
under oral administration performed better than disabled students on the 
non-accommodated format. On the non-accommodated format, students with 
disabilities fared worse than general education students. 


Idstein (2003) 


Qualitative results show the better students do well in less time than it takes 
weaker students to achieve lower grades. Weaker students rely excessively 
on their dictionaries and do not trust themselves. Dictionary use does not 
affect the scores or test time of the better students, and may actually slow 
down and negatively affect the scores of weaker students. 


Jackson (2003) 


Scores did not differ among individual students due to demographic factors. 


Janson (2002) 


Students who received special education services and received 
accommodations experienced significant gains in scores in science and social 
studies in the year they were initially granted accommodations. There were 
substantial gains in science and social studies in 2000 for students initially 
receiving accommodations. There were significant gains in social studies and 
math scores in 2001 for students initially receiving accommodations. 


Kappel (2002) 


In general, no increase in scores was found when testing accommodations 
were used, and no differential response to using accommodations was found 
among subgroups or those without disabilities. 


Katzman (2004) 


The students with disabilities reported that they did not feel that they were 
prepared to take the MCAS because they believed that they were not taught 
the content on the test. Many students were not enrolled in courses that 
prepared them for the MCAS. 


Kettler (2003) 


Among fourth grade students, accommodations provided a larger effect 
for students with disabilities than for students without disabilities on both the 
mathematics and the reading tests. Among eighth grade students, the 
effects of testing accommodations depended on the test content (math 
versus reading). The effects of testing accommodations on the math tests 
were somewhat higher for students with disabilities than for students without 
disabilities. Conversely, the effects of testing accommodations on the reading 
tests were slightly lower for students with disabilities than for students without 
disabilities. 


Kobrin (2003) 


The results suggest that computerized and paper-and-pencil reading 
comprehension tests may be more cognitively similar than originally thought. 
The only significant difference between computerized and paper-and-pencil 
tests was in the frequency of identifying important information in the passage. 


Landau (2003) 


Students performed better on five of the eight items when using the Talking 
Tactile Tablet, and performed the same on the remaining three. Using the 
Talking Tactile Tablet also yielded item difficulties that more closely resembled 
the item difficulties obtained by general education students during testing. 


Macarthur (2004) 


The results indicate that two-thirds (68%) of the students achieved 85% 
accuracy and more than one-third (40%) achieved 90% accuracy using 
dictation to a scribe or speech recognition software. Only 3 students (10%) 
were below 80% accuracy. Results for adults have been reported between 
90% and 98%. Results also demonstrate that both dictation conditions helped 
students with learning disabilities produce better essays. Students with 
learning disabilities produced higher quality essays when using a scribe than 
when using speech recognition software. Essays from both adapted conditions 
were of better quality than handwritten essays. 


McKevitt (2003) 


The use of the read-aloud accommodation did not significantly improve the 
test performance of either group of students. Teachers as a group had neutral 
attitudes about testing and testing accommodations. 


Meloy (2002) 


Analyses revealed that students in both groups (LD-R and non-LD) achieved 
significantly higher test scores with the read aloud test administration. 


Nickerson (2004) 


The students felt that a majority of the accommodations used were helpful. 


Pomplun (2002) 


Analyses indicated that both forms of the computerized version produced 
higher vocabulary scores than the paper-and-pencil format, and one form also 
had higher comprehension and total scores on the computerized version. 
These differences appeared to be related to the differences in response 
speed associated with use of a mouse to record responses as opposed to 
a pencil and answer sheet. Scores on the paper-and-pencil version and the 
computerized version had similar predictive power for course-placement. 


Reed (2002) 


Instances of learned helplessness and low motivation, a problem for LD and 
AD/HD students, were observed. To aid these students, test makers must 
do the following: be cautious with the context, ask the question clearly, and 
repeat key words when possible in the response options. 


Scheuneman (2002) 


Almost 95% of students brought calculators to the November administration 
of the examination in both years. About 65% used their calculators on one 
third or more of the items. Group differences in the use of calculators were 
detected with girls using calculators more frequently than boys and Whites 
and Asian Americans using them more often than other racial groups. 
Although calculator presence, frequency of use, and calculator type were 
all correlated with test scores, this relation appears to be the result of the 
more able students using calculators differently from the less able students. 
Regression analyses revealed that a small percentage of the variance in 
test scores was accounted for by calculator access and type of calculator. 
Differential item functioning analyses (DIF) showed items favoring both 
frequent use and little use of calculators. Data concerning the rate of 
completion provided evidence that those using calculators less often were 
more likely to complete the exam. 


Shriner (2003) 


In this intervention study, training was found to increase the quality and extent 
of participation and accommodation documentation on the IEP. Correlations 
between what was documented on the IEP and what happened on the day 
of testing were highly variable. Although students’ IEPs appeared to reflect 
individualized decisions, political and logistical factors limited the utility of the 
IEP and interfered with its actual implementation. 


Tavani (2004) 


Findings demonstrated non-significant performance score increases when 
students with learning disabilities who used accommodations were compared 
to those students who did not use accommodations. 


Thornton (2002) 


Overall, results suggest that LSAT scores earned under the nonstandard time 
condition are not comparable to LSAT scores earned under standard timing 
conditions. Results for individual subgroups were consistent with the overall 
group result. 


Tindal (2002) 


A main effect for both student classification and test administration 
was found for the elementary school students: low-achieving students 
outperformed students with IEPs, and both groups benefited from a 
videotaped administration. For middle school students, a main effect was found 
for student classification; however, no main effect was found for the type of test 
administration (video versus standard). 


Trammell (2003) 


The impact of special accommodations on the subgroups revealed a 
significant improvement in grades for students with ADD and students 
with Learning Disabilities and ADD, but a drop for students with Learning 
Disabilities. Students with ADD and students with both ADD and Learning 
Disabilities experienced an increase in grades with all types of accommodations; 
conversely, students with learning disabilities experienced a drop in grades 
with each accommodation. 


Weston (2003) 


The findings revealed a statistical difference between the tests, and also 
between the two groups of students. Students with learning disabilities who 
are poor readers gained the most from the read-aloud accommodation. 
Results also suggest that scores on the accommodated test better match 
the teacher’s estimations of the student’s mathematical abilities. 


Woods (2004) 


The investigation found a low level of candidate need for a reader, with 
candidate reading age and self-prediction being unreliable indicators of this 
need. 


Appendix G — Summary of Limitations Cited by Researchers 



Author 


Small Sample Size/Sample Too Narrow in Scope 


Barton (2003) 


This study included mostly mild to moderately disabled students. A replication 
with participating students with more severe disabilities, particularly severe 
physical disabilities, would certainly be beneficial. 


Bridgeman (2003) 


Although this study provides evidence that issues of screen size and 
resolution cannot be ignored, even larger studies are needed to understand 
fully the separate roles of screen size and resolution. 


Burch (2002) 


The study is limited by the small sample sizes of the groups. So, it is possible 
the students were not representative of the entire population of fourth-graders 
because the sample sizes were small. 


Crawford (2004) 


Further research using larger samples is needed on the effects of extended 
time for students with learning disabilities in the upper grades. 


Hall (2002) 


The study focuses only on fifth grade students, and the results may not 
generalize to students with disabilities in other grades or to students with 
dissimilar disabilities, socio-economic statuses, etc. 


Helwig (2002) 


There were too few fifth-grade low-skill readers taking Form A in 
accommodated format to do meaningful analyses. 


Idstein (2003) 


The sample size was smaller than expected due to unpredictable attendance 
rates. 


Jackson (2003) 


Visual impairment is a low incidence disability and the number of possible 
participants is restricted. Many of the potential participants were eliminated 
because they were given the alternate assessment due to additional 
disabilities that affect student performance. 


Janson (2002) 


The study was conducted in a small school system in Tennessee. The study 
was limited to 448 students. Due to the small sample size of students who 
took the tests with accommodations, it would be problematic to generalize the 
findings to a larger population. 


Kappel (2002) 


The sample size of 11 to 12 per group was possibly insufficient to detect 
effects that were present. 


Kettler (2003) 


One potential limitation of this study is that we examined only two test content 
areas and two grade levels. Specifically, we examined only mathematics and 
reading, although students are tested in science and social studies as well. 


Landau (2003) 


The small sample limited the analysis of the impact of the test 
accommodations on the psychometric properties of items. 


McKevitt (2003) 


This study focused solely on reading. Although the question of interest 
focused on a read-aloud accommodation, it was addressed only in the 
context of a reading test. 


Pomplun (2002) 


The ability to draw generalizations from these results could be limited by 
having students from only seven schools participate in the study. 


Tavani (2004) 


This study was intentionally limited with respect to its population and 
dependent variable. The target population was restricted to students in the 
4th, 8th, and 12th grades in the United States in 2000; consequently, this study 
may have sacrificed some external validity. 


Author 


Conflicting Results 


Tindal (2004) 


Much of this research is tentative with conflicting overall test results: 
some findings show positive effects for all students, other findings reflect 
interactions between an accommodation and a population. 


Weston (2003) 


For non-disabled students the evidence is mixed and may be flawed by 
methodological problems. First, very low readers in the regular classroom 
did not seem to profit from the accommodation. Second, item content did not 
seem to affect general education students. Third, these students performed 
better on a number of items in the paper and pencil format. 


Author 


Nonstandard Administration Across Proctors and Schools 


Huynh (2002) 


South Carolina High School Exit Examination tests are un-timed; hence all 
students are permitted to take as long as they need to complete the tests 
regardless of whether they are disabled or not. Therefore, the findings of this 
study are not applicable to test administration modes that involve 
extended-time accommodations. 


Huynh (2004) 


Different school authorities made IEP and 504 accommodation decisions 
across grades 8 and 10; therefore, it is conceivable that a subset of this 
population should have been tested under oral administration at grade 8. 


Author 


Confounding Factors 


Barton (2002) 


Some students may be more accustomed to receiving the oral 
accommodation in their daily instruction and may therefore be practiced in 
test taking forms or environments that involve a good deal of listening. 


Barton (2003) 


A confound in this study was that students were not randomly assigned to 
discourse type nor to specific prompts within discourse type. 


Cahalan (2002) 


Some of these variations may be due to different populations of students 
used in the second sample. In sample two, colleges and universities were 
permitted to omit students for any reason including receiving services for a 
learning disability and having a FGPA less than 1.0. 


Elbaum (2004) 


The confounding of the accommodation with concomitant factors such as 
self-pacing and individual administration was a serious limitation. 


Elliott (2004) 


That the students had more than enough time in the standard time condition 
likely diminished the impact of the extra time accommodation. Also, nearly all 
students with disabilities receive multiple accommodations on district and 
statewide tests; thus the extended time accommodation, when provided in 
isolation, is contrived and not realistic. 


Gagnon (2004) 


Two limitations exist with the current study: (a) low response rate; and (b) 
differences in the characteristics of respondents versus nonrespondents. 


Kobrin (2003) 


An important limitation of this study is the lack of a time limit imposed on 
participants, because actual testing situations include time limits. 


Macarthur (2004) 


It is important to keep in mind that this study did not include extensive training 
in the use of speech recognition. Students received approximately 6 hours of 
individual instruction on training the software to recognize their speech and 
using it to compose essays. 


Meloy (2002) 


The read-aloud administration did not permit student self-pacing, and this 
procedure could have had an impact on students’ maintaining attention to the 
test. Moreover, the administration was done by reading scripts for the various 
testings, whereas using a prerecorded tape could have reduced possible 
varying reader emphases. 


Nickerson (2004) 


Variables other than language proficiency and accommodations may affect 
performance on the SAT-9 (e.g. program model, access to curriculum, 
instruction, test preparation, etc.). 


Reed (2002) 


The use of the think-aloud procedure in this research may have in itself been 
beneficial for student performance. The process of self-explanation has been 
shown to improve students’ problem-solving performance. 


Trammell (2003) 


Students with learning disabilities or learning disabilities plus ADD were not 
well matched with the accommodations they selected and were granted. This 
pitfall was the first and foremost limitation addressed in the design of the 
experiment and requires much further refinement and investigation. 


Author 


Flaw in Research Design 


Bolt (2004) 


This was not an experimental study, and there are consequently limits to the 
inferences that can be made. 


Cisar (2004) 


The questionnaire that was developed for this study was not tested for 
construct or content validity. The validity of the test will indicate if it measured 
what it presumed to measure thus making the instrument more meaningful. 


Elliott (2003) 


When given twice as much time to work on the test, students neither took 
advantage of the extra time, nor showed significant gains in their scores. The 
students had more than enough time to complete the test, diminishing the 
impact of the extra time accommodation. 


Tindal (2002) 


The sampling plan of the study was neither random nor stratified for teachers 
or students. Rather, teachers had been nominated for participation based on 
personal contacts of state department personnel through their own networks 
with principals and others in the local educational agencies. 


Author 


No Limitations Mentioned 


Bielinski (2003) 




Bridgeman (2004) 




Choi (2002) 




Dempsey (2003) 




Hansen (2002) 




Helwig (2003) 




Katzman (2004) 




Scheuneman (2002) 




Shriner (2003) 




Thornton (2002) 




Woods (2004) 




Author 


Not Applicable/Meta-analysis 


Sireci (2003) 




Tindal (2004) 




Appendix H — Summary of Suggestions for Future Research (as recommended by authors) 



Author 


Investigate characteristics of accommodations themselves in further 
detail. 


Barton (2003) 


It is important to continue the research in the area of comparability of 
assessments that are administered online or on paper, both on scoring 
comparability and on the performance comparability of all students. 


Bielinski (2003) 


The findings in this study point to the need for further dialogue and more 
research on test accommodations. The opinions of those who influence 
policy and who are familiar with test accommodations vary too much to 
ignore. 


Elbaum (2004) 


The effects of different components of the accommodation need to be 
assessed separately. 


Elliott (2004) 


Researchers need extra time to answer this question: It is difficult to 
determine what a lack of boost in scores on an accommodated test conveys 
about the effectiveness of the accommodation. Specifically, did the 
accommodation provide access to the test so that the student’s true ability 
was assessed, or did the accommodation itself negatively affect the student’s 
performance? 


Janson (2002) 


Further research studies should be conducted in other Tennessee school 
systems to determine if accommodations, as provided in Tennessee, “level 
the playing field.” 


Kappel (2002) 


This study might also be expanded to include other factors and other 
accommodations. The effects of the Extended Time and Read-Aloud 
accommodations when administered in a group setting should also be 
investigated. 


Kobrin (2003) 


Future research should highlight the ways in which computerized tests may 
completely replace paper-and-pencil tests. 


McKevitt (2003) 


Future research examining the differential impact that decoding, fluency, or 
comprehension difficulties may have on reading test performance and the 
effects of accommodations would be useful. 


Meloy (2002) 


Future research on the read aloud accommodation is needed. Additional 
aspects of the read aloud procedures should be studied, and further 
refinements in design and sampling would be helpful. 


Nickerson (2004) 


An investigation into content validity, consequential validity and possible test 
bias is warranted for English language learners being assessed for academic 
achievement in the content areas of Reading/Language Arts and Mathematics 
with the SAT-9 tests. 


Tavani (2004) 


An investigation of a similar model utilizing NAEP databases is warranted in 
order to examine the impact these characteristics have on differing subject 
performances. 


Weston (2003) 


For any policy decision that contemplates providing students with 
accommodations, more research should be done to learn if the patterns 
shown in this study can be reproduced. 


Author 


Investigate student factors contributing to accommodations use. 


Bridgeman (2003) 


More research is needed to investigate the effects of a high stakes test on 
the psychological state of computer scrolling test takers. This study did not 
address that issue. 


Crawford (2004) 


Follow-up research studies investigating students’ use of time during writing 
assessments will provide researchers with information related to differences 
across grades and educational classifications. The stakes associated with 
large-scale testing are too high to ignore the need for empirical evidence 
supporting the validity of multiple-day writing assessments. 


Idstein (2003) 


An observation based on interview data and subsequently discussed with 
classroom teachers points to a possible correlation between general 
personality traits and dictionary use. This topic may warrant further 
investigation. 


Katzman (2004) 


Future research should examine the amount of support required to help 
students stay motivated. 


Scheuneman (2002) 


Differences in students’ approach to problems when using calculators are an 
area where further investigation would be required. 


Trammell (2003) 


Students with learning disabilities in addition to ADD exhibited erratic decision 
making regarding accommodation requests. They may have been ill matched 
to the accommodation used during test taking. This pitfall requires much 
further refinement and investigation. 


Author 


Improved Study Design or Study Replication 


Cahalan (2002) 


More research is needed to investigate which factors contribute to the varied 
correlations in this study. 


Helwig (2002) 


If the poor performance of some of the participants was, in fact, due to the 
distraction of the video when it was not needed, a logical solution would be 
an on-demand delivery system. A computer, audiotape, or live reading of 
only items selected by students on an individual basis would likely solve this 
problem. Further research in this area is warranted. 


Landau (2003) 


It is strongly suggested that in future studies participants be allowed to work 
with the Talking Tactile Tablet prior to testing and that more thorough 
beta-testing should be performed prior to testing. 


Tindal (2002) 


Further research may begin to utilize a videotaped administration along 
with other, more powerful and individualized accommodations that deal with 
setting and time. 


Tindal (2004) 


An effort needs to be made to improve consistency and systematicity in both 
practice and research, while maintaining clarity. 


Author 


Study policy and accommodations hypotheses. 


Bolt (2004) 


Research should continue to investigate accommodation decision-making 
and administration practices as well as Universal Design. 


Gagnon (2004) 


The results of this study indicate a need for more research concerning how 
the policies related to increased accountability are being implemented in 
special schools. 


Reed (2002) 


Additional clarification is needed for discussing the role of item or test intent 
in determining the appropriateness of an accommodation. Clear definitions 
of the constructs a given test measures should inform the practice of granting 
accommodations. 


Sireci (2003) 


Due to a wide variety of results stemming from experiential research it is 
suggested that a revision of the interaction hypotheses be proposed and 
that directions for future research and for improved test development and 
administration practices be proposed. 


Author 


Replicate study with larger sample. 


Burch (2002) 


Future studies should address the limitations of this study as well as the other 
studies in this area. The small sample sizes in this study create a need for 
replication. 


Jackson (2003) 


This study should be replicated to validate the results, with a larger sample 
across a variety of states, classifying adaptations into components identified 
in this study, and with more comparison to assessment results of 
non-disabled students. 


Pomplun (2002) 


Because the present study was based on students from only seven 
institutions, research studies should be continued, especially to support 
predictive validity of the computerized versions. 


Author 


Investigate teacher factors related to accommodations selections. 


Cisar (2004) 


Replication of this study in other states may be beneficial to determine if 
teachers can distinguish and know when to use adaptations nationwide. 


Kettler (2003) 


More research is needed on the apparently highly individualized nature of 
the impact of testing accommodations. What factors influence educators’ 
selection of testing accommodations for specific students? 


Author 


Investigate possible instructional uses for accommodations. 


Barton (2002) 


Future analyses that seek to guide instructional effectiveness of students 
will augment this research. It would be interesting to look at each item 
that loaded on particular factors to see what qualitative characteristics 
they possess. If commonalities exist across items, such information may 
supplement instructional level information and types of approaches teachers 
may take with students. 


Author 


Investigate Accommodation Practicality 


Macarthur (2004) 


Future research should investigate the practical issues involved in using 
speech recognition in school settings and the impact of use over an extended 
time. 


Author 


No Suggestions for Future Research Mentioned 


Barton (2003) 




Bridgeman (2004) 




Choi (2002) 




Dempsey (2003) 




Elliott (2003) 




Hall (2002) 




Hansen (2002) 




Hansen (2002) 




Helwig (2003) 




Huynh (2002) 




Huynh (2004) 




Shriner (2003) 




